
Add new AutoScaler and CustomScalerBase classes #1429

Merged: 78 commits merged into IDAES:main on Oct 9, 2024

Conversation

@andrewlee94 (Member) commented on Jun 4, 2024:

Summary/Motivation:

This PR adds a draft of the new Scaler classes, along with some general utility functions for the new scaling interface. I have generally created new functions even where older equivalents existed, both to make a clean break from the old API and to preserve backward compatibility.

Draft Documentation: Initial Outline of Documentation for Scaling Tools.docx

Changes proposed in this PR (a brief usage sketch follows the list):

  • New ScalerBase class
  • New AutoScaler class
  • New CustomScalerBase class
  • Utility methods for manipulating scaling suffixes
  • Demonstration tests of new methods on Gibbs reactor
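For orientation, a minimal usage sketch of the new interface. The method names here are hypothetical, assembled from the class names above and the review discussion below; exact signatures may differ from what merged.

# Minimal usage sketch; import path and method names are assumptions
# based on this PR's discussion, not guaranteed to match the merged API.
from idaes.core.scaling.autoscaling import AutoScaler

def apply_autoscaling(model):
    """Apply the new AutoScaler to an existing Pyomo/IDAES model."""
    scaler = AutoScaler()
    scaler.scale_variables_by_magnitude(model)        # variables first
    scaler.scale_constraints_by_jacobian_norm(model)  # then constraints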

Legal Acknowledgement

By contributing to this software project, I agree to the following terms and conditions for my contribution:

  1. I agree my contributions are submitted under the license terms described in the LICENSE.txt file at the top level of this directory.
  2. I represent I am authorized to make the contributions and grant the license. If my employer has rights to intellectual property that includes these contributions, I represent that I have received permission to make contributions and grant the required license on behalf of that employer.

@andrewlee94 self-assigned this on Jun 4, 2024
idaes/core/scaling/autoscaling.py (resolved review thread, outdated)
Comment on lines +242 to +249
jac, nlp = get_jacobian(blk, scaled=True)

if con_list is None:
    con_list = nlp.get_pyomo_equality_constraints()
Contributor:

In my experience, get_jacobian is slow because forming the Pynumero NLP is slow. If we have flowsheet-level variables, we end up getting a Jacobian for the flowsheet and every subcomponent. I'm not sure how slow forming the NLP is going to be for PropertyBlock 43 in the MEA column, but you're still at least doubling the work you're doing.

I would suggest getting the Jacobian and NLP once at the beginning of constraints_by_jacobian_norm, then just referring to rows and columns of the overall object as you want to scale constraints.
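A rough sketch of that suggestion (assuming get_jacobian lives in idaes.core.util.scaling and returns a scipy sparse matrix plus a Pynumero NLP; sfx stands in for the block's scaling Suffix):

# Sketch only, not the PR's actual code: form the Jacobian and NLP once,
# then scale every constraint from the row norms of the cached matrix.
from scipy.sparse.linalg import norm as spnorm

from idaes.core.util.scaling import get_jacobian  # assumed import path

def scale_constraints_from_one_jacobian(blk, sfx):
    jac, nlp = get_jacobian(blk, scaled=True)
    row_norms = spnorm(jac.tocsr(), axis=1)  # one pass over the whole matrix
    for i, con in enumerate(nlp.get_pyomo_constraints()):
        if row_norms[i] != 0:
            sfx[con] = 1.0 / float(row_norms[i])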

Member Author (@andrewlee94):

I have been thinking about the best way to do this, and did realize I was probably getting the Jacobian more times than necessary. I was not aware that having higher-level components in the problem causes the Jacobian for the entire problem to be generated (and given how properties work, that will be an issue). I am also thinking there might need to be some checks to deal with deactivated components (and that it might be possible to deactivate unnecessary components before getting the Jacobian to reduce the problem size).

Member Author (@andrewlee94):

I think I've fixed this so that this method will only be called once for any call to the scaling method.

constraints_by_jacobian_norm (line 156) collects all the constraints we want to scale and the top-level block, and then only calls the autoscaler once.

# Use scipy to get all the norms
# Should be more efficient than iterating in Python
axis = (
    1  # Could make this an argument to also support variable-based norm scaling
)
Contributor:

Based on your comment, should we consider adding another method for variable-based norm scaling?

Member Author (@andrewlee94):

I do not know if this makes sense or not - comments would be welcome (@dallan-keylogic).

Contributor:

If we scale variables first based on order-of-magnitude, we can then scale constraints based on row/column norm. However, norm-based variable scaling probably won't be useful for a fully unscaled model, because the most common Jacobian entries will be O(1).

For a partially-scaled model, though, filling in variable scaling factors based on column norms might be useful.
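A hypothetical column-norm variant of the same idea, filling in factors only for variables that do not yet have one (import path and suffix handling are assumptions):

from scipy.sparse.linalg import norm as spnorm

from idaes.core.util.scaling import get_jacobian  # assumed import path

def fill_variable_factors_from_columns(blk, sfx):
    """For a partially scaled model, set factors for unscaled variables
    from the 2-norms of the scaled Jacobian's columns."""
    jac, nlp = get_jacobian(blk, scaled=True)
    col_norms = spnorm(jac.tocsc(), axis=0)
    for j, var in enumerate(nlp.get_pyomo_variables()):
        if var not in sfx and col_norms[j] != 0:
            sfx[var] = 1.0 / float(col_norms[j])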



@document_kwargs_from_configdict(CONFIG)
class AutoScaler:
Contributor:

It'd be nice to have a method like report_scaling that would give you vars and constraints in the first column, scaling factor values in the second, and the scaled values in the third.

Member Author (@andrewlee94):

The first two you can get by calling suffix.display(). The third is the tricky part: a scaled value is easy enough for Vars, but what is the "scaled value" of a Constraint? The scaled residual is easy to get, but not always that meaningful (the scaled Jacobian is often more informative).
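For the Var half, a hypothetical helper along the lines being requested might look like this (scaling_factor is the standard suffix name; this is a sketch, not part of the PR):

import sys

from pyomo.environ import Var, value

def report_var_scaling(blk, stream=sys.stdout):
    """Print name, scaling factor, and scaled value for each scaled Var
    in this block's scaling Suffix."""
    for comp, factor in blk.scaling_factor.items():
        if comp.ctype is Var and comp.value is not None:
            stream.write(f"{comp.name}\t{factor}\t{value(comp) * factor}\n")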

Comment on lines 478 to 481
assert jacobian_cond(methane, scaled=True) == pytest.approx(9191, abs=1)

# Check for optimal solution
assert check_optimal_termination(results)
assert len(extreme_jacobian_rows(methane, scaled=True)) == 0
assert len(extreme_jacobian_columns(methane, scaled=True)) == 0
Contributor:

Actually, adding to my other suggestion about a reporting method for AutoScaler: these would also be good to show in the report output (e.g., condition number and extreme Jacobian rows/columns, or at least the number of each).

I realize the DiagnosticsToolbox can already report some of these things, but it would be nice to apply scaling via AutoScaler and then check what you actually did to the model's scaling with a report method.

Member Author (@andrewlee94):

That is what the Diagnostics tools are for - putting those methods here would just be duplication of code (and the diagnostics also tell you a lot more). The eventual documentation will highlight that you really need to use diagnostics, scaling and initialization tools together to get the best results.

Contributor:

Will the diagnostics toolbox reference these new tools?

Member Author (@andrewlee94):

@agarciadiego I am not sure there - we probably need to think about how and where we draw the line between the two. Up until now, the Diagnostics Toolbox has been about finding the issues, and hasn't said much about how to fix them (as that is not so easy to explain concisely). We can definitely mention it in the docs, and maybe we could have the display methods mention tools to help fix issues (where appropriate).

@ksbeattie added the Priority:Normal (Normal Priority Issue or PR) label on Jun 6, 2024
@andrewlee94 marked this pull request as ready for review on September 30, 2024
Comment on lines +8 to +15
* large Jacobian Condition Numbers (>1e10),
* variables with very large or very small values,
* variables with values close to zero,
* variables with values close to a bound (conditionally),
* constraints with mismatched terms,
* constraints with potential cancellations,
* variables and constraints with extreme Jacobian norms,
* extreme entries in Jacobian matrix (conditionally).
Contributor:

These are all conditional issues, although extreme Jacobian entries are more conditional than the others.
I believe "constraints with potential cancellations" are typically model formulation issues rather than scaling issues. "Variables with values close to zero" is redundant with "variables with very large or very small values". "Variables with values close to a bound" typically end up being variables close to zero; I don't recall an example where a variable close to a nonzero bound turned out to be bad scaling.

You could rank order them:

  1. Variables and constraints with extreme Jacobian column/row norms
  2. Variables with very large or very small values
  3. Gap
  4. Variables close to bounds
  5. Constraints with mismatched terms
  6. Extreme Jacobian Entries

As a side note, an instructive exercise for constraints with mismatched terms is to imagine setting the mismatched term to zero. If the Jacobian is nonsingular, it's probably fine. If it becomes singular, it's probably an issue.

Member Author (@andrewlee94):

For this documentation I deliberately didn't go into too much detail. There is definitely room for a more detailed discussion of all these issues, but for the basic user I just wanted to give a list of things to look for.

docs/explanations/scaling_toolbox/index.rst (2 resolved review threads)
idaes/core/scaling/custom_scaler_base.py (resolved review thread, outdated)
Args:
    model: parent model of submodel
    submodel: name of submodel to be scaled, as a str
    submodel_scalers: user-provided dict of Scalers to use for submodels
Contributor:

We could use the actual submodel objects if we did a ComponentMap. I assume you're aware of the option and elected for strings and dictionaries for some performance reason.

Member Author (@andrewlee94):

I think the reason was that I assumed we would not have the objects yet, but as this is a user-facing function (and not a set of defaults) that is not the case. I will update this too.
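For reference, the ComponentMap version suggested above would look roughly like this (the block path and Scaler class are placeholders):

from pyomo.common.collections import ComponentMap

# ComponentMap hashes by component identity, so the submodel objects
# themselves can be the keys. Names below are placeholders.
submodel_scalers = ComponentMap()
submodel_scalers[m.fs.unit.properties_in] = MyPropertyScaler
submodel_scalers[m.fs.unit.properties_out] = MyPropertyScaler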

idaes/core/scaling/scaler_profiling.py (resolved review thread, outdated)
Comment on lines +125 to +129
sdict["block_datas"] = {}
for bd in blk_or_suffix.values():
sdict["block_datas"][bd.name] = _collect_block_suffixes(
bd, descend_into=descend_into
)
Contributor:

I was going to ask what happens if multiple scaling suffixes for the same variable are encountered on different blocks, but it looks like the full block structure is recreated in this function.

Member Author (@andrewlee94):

For all of these tools, we assume that the IDAES-controlled suffix will be on the parent block of the component. Pyomo does support putting scaling factors elsewhere, but that would require the user to set those manually (and thus they are responsible for managing them).
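For illustration, the serialized structure inferred from the snippet above mirrors the block hierarchy, roughly as follows (key names other than "block_datas" are guesses, not copied from the PR):

# Illustrative shape only; each sub-BlockData nests under "block_datas".
sdict = {
    "scaling_factor": {"x": 1e-3, "flow_mol": 1e2},  # this block's suffix
    "block_datas": {
        "fs.unit.control_volume": {
            "scaling_factor": {},
            "block_datas": {},
        },
    },
}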

idaes/core/scaling/util.py (resolved review thread)
Contributor @dallan-keylogic left a comment:

Looks good besides some minor corrections and tweaks. The real test will be once we use it and once we start trying to get users to use it.

Comment on lines +384 to +388
if not overwrite and component in sfx:
    _log.debug(
        f"Existing scaling factor for {component.name} found and overwrite=False. "
        "Scaling factor unchanged."
    )
Contributor:

Is there a way to suppress debug messages like this for the scaling routines only? When I turn on Debug logging, it's typically to look at the IPOPT output of an initialization solve. I guess that involves passing the output level to the initialization object, so I imagine that the scaling routines take an outlevel argument in the same way.

Member Author (@andrewlee94):

Not quite - you should be able to control the logging level for each subpackage in IDAES, i.e. you can set different levels for idaes and idaes.core.scaling.
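A minimal sketch, assuming IDAES loggers follow the standard Python logger hierarchy:

import logging

# Keep IDAES-wide logging at DEBUG (e.g. to see initialization solves)...
logging.getLogger("idaes").setLevel(logging.DEBUG)
# ...while silencing debug chatter from the scaling subpackage only.
logging.getLogger("idaes.core.scaling").setLevel(logging.INFO)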

Comment on lines +444 to +447
if not isinstance(blk, (Block, BlockData)):
    raise TypeError(
        "report_scaling_factors: blk must be an instance of a Pyomo Block."
    )
Contributor:

Are ConcreteModels instances of Blocks?

Member Author (@andrewlee94):

Yes.
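(In Pyomo, ConcreteModel inherits from Model, which is itself a Block:)

from pyomo.environ import Block, ConcreteModel

m = ConcreteModel()
print(isinstance(m, Block))  # True: ConcreteModel -> Model -> Block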

stream.write(f"{n + ' ' * (maxname - len(n))}{TAB}{i}\n")


def get_nominal_value(component):
Contributor:

This method is technically correct, but I'm not sure how useful this concept of a "nominal value" is going to be. We'll see how it gets used.

Member Author (@andrewlee94):

This is the underlying function that is used by the expression walker to get the nominal value of an expression/constraint. It is primarily intended to be used in the expression walker (or limited similar circumstances).
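A hypothetical usage example, assuming the nominal value of a scaled Var is the inverse of its scaling factor (the import path follows this PR's util module):

from pyomo.environ import ConcreteModel, Suffix, Var

from idaes.core.scaling.util import get_nominal_value  # assumed path

m = ConcreteModel()
m.scaling_factor = Suffix(direction=Suffix.EXPORT)
m.x = Var()
m.scaling_factor[m.x] = 1e-3  # x is expected to be O(1e3)
print(get_nominal_value(m.x))  # expected: 1000.0, i.e. 1/scaling factor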

idaes/core/scaling/util.py (resolved review thread, outdated)
idaes/models/unit_models/tests/test_gibbs_scaling.py (resolved review thread, outdated)
scaled = jacobian_cond(methane, scaled=True)

assert scaled < unscaled
assert scaled == pytest.approx(6.96238e15, rel=1e-5)
Contributor:

Once this gets merged, I want to see what's causing the condition number to still be so badly behaved.

Member Author (@andrewlee94):

My guess is that it is because the model is uninitialized, while the scaling factors were based on good guesses for the actual final state - i.e. there is a mismatch between the current uninitialized state and the scaling factors it was given. If we went and solved the model, I suspect the value would be much lower.

Contributor @agarciadiego left a comment:

Looks good to me, just a few typos, plus a couple of questions about TODOs and commented-out code.

docs/explanations/scaling_toolbox/scaling_theory.rst (2 resolved review threads, outdated)
docs/explanations/scaling_toolbox/scaling_toolbox.rst (2 resolved review threads, outdated)
docs/explanations/scaling_toolbox/scaling_workflow.rst (resolved review thread, outdated)
idaes/core/scaling/custom_scaler_base.py (resolved review thread, outdated)
Args:
    constraint: constraint to set scaling factor for
    scheme: method to apply for determining constraint scaling
        'harmonic_mean': (default) sum(1/abs(nominal value))
Contributor:

Not for this PR, but the schemes could be explained in more detail in the documentation.
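In the meantime, a sketch of the arithmetic the docstring above implies (the helper name is hypothetical; term nominal values are assumed given):

def harmonic_mean_scaling(term_nominal_values):
    """'harmonic_mean' scheme per the docstring above: the constraint
    scaling factor is sum(1/abs(nominal value)) over its terms."""
    return sum(1.0 / abs(v) for v in term_nominal_values if v != 0)

# Terms of magnitude 1e3 and 1e5 give a factor of about 1.01e-3.
print(harmonic_mean_scaling([1e3, 1e5]))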

idaes/core/scaling/scaling_base.py (resolved review thread, outdated)
Contributor @dallan-keylogic left a comment:

LGTM

@andrewlee94 merged commit 0e05ed1 into IDAES:main on Oct 9, 2024 (58 checks passed)
@andrewlee94 deleted the autoscaling branch on October 9, 2024
Labels: Priority:High (High Priority Issue or PR)

7 participants